Abstract

We consider the problem of partial order production: arrange the elements of an unknown
totally ordered set T into a target partially ordered set S, by comparing a minimum number
of pairs in T. Special cases include sorting by comparisons, selection, multiple selection, and
heap construction.

We give an algorithm performing ITLB + o(ITLB) + O(n) comparisons in the worst case.
Here, n denotes the size of the ground sets, and ITLB denotes a natural information-theoretic
lower bound on the number of comparisons needed to produce the target partial order.

Our approach is to replace the target partial order by a weak order (that is, a partial
order with a layered structure) extending it, without increasing the information-theoretic
lower bound too much. We then solve the problem by applying an efficient multiple selection
algorithm. The overall complexity of our algorithm is polynomial. This answers a question
of Yao (SIAM J. Comput. 18, 1989).

We base our analysis on the entropy of the target partial order, a quantity that can be
efficiently computed and provides a good estimate of the information-theoretic lower bound.

Keywords: Partial order, graph entropy 

1 Introduction 

We consider the Partial Order Production problem:

Given a set $S = \{s_1, \ldots, s_n\}$ partially ordered by a known partial order $\preceq$ and a set
$T = \{t_1, t_2, \ldots, t_n\}$ totally ordered by an unknown linear order $\le$, find a permutation $\pi$ of
$\{1, 2, \ldots, n\}$ such that $s_i \preceq s_j \Rightarrow t_{\pi(i)} \le t_{\pi(j)}$, by asking questions of the form:
"is $t_i \le t_j$?".

The Partial Order Production problem generalizes many fundamental problems (see
Figure 1), corresponding to specific families of posets $P := (S, \preceq)$. It amounts to sorting by
comparisons when P is a chain. The selection [17] and multiple selection [8] problems are special
cases in which P is a weak order, that is, has a layered structure (with $a \prec b$ iff a is on a lower
layer than b). When the Hasse diagram of P is a complete binary tree, the problem boils down to
heap construction [6].

Figure 1: Special cases of the Partial Order Production problem.

We assume that the target poset P is part of the input and represented by its Hasse diagram.
Hence the size of the input can be $\Omega(n^2)$, whereas sorting the n elements of T takes $O(n \log n)$
time. In other words, reading the input could take more time than is necessary to solve the problem,
provided a topological sorting of P is known.

To cope with this paradoxical situation, we consider algorithms that proceed in two phases: a 
preprocessing phase during which an ordering strategy is determined (for instance, in the form of a 
decision tree, or any more efficient description, if possible), on the basis of the structure of P, and 
an ordering phase during which all comparisons between elements of T are performed. Accordingly, 
we distinguish the preprocessing complexity and the ordering complexity of the algorithm, the latter 
being essentially proportional to the number of comparisons performed. 

As noted before, we expect the overall complexity of the algorithm to be dominated by its 
preprocessing complexity. Thus it is desirable to perform the preprocessing phase only once, and 
then use the resulting ordering strategy on several data sets. 



Lower bound on the number of comparisons. We denote by e(P) the number of linear
extensions of the target poset P. Feasible permutations $\pi$ are in one-to-one correspondence with
the linear extensions of P, thus the number of feasible permutations is exactly e(P). On the other
hand, the total number of permutations is $n!$. We thus have the following information-theoretic
lower bound (logarithms are base 2):

Theorem 1 ([1,28,30]). Any algorithm solving the Partial Order Production problem for
an n-element poset P requires

$$\mathrm{ITLB} := \log n! - \log e(P)$$

comparisons between elements of T in the worst case and on average.

Note that we can assume without loss of generality that P is connected, hence we also have a
lower bound of $n - 1$.
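For very small posets, the bound of Theorem 1 can be evaluated directly. The following Python sketch is an illustration only: it counts linear extensions by brute force over all n! orderings, so it is usable only for tiny n. The function names and the three example posets are illustrative choices, not taken from the paper.

```python
from itertools import permutations
from math import factorial, log2


def count_linear_extensions(n, relations):
    """Count orderings of 0..n-1 in which a comes before b for every (a, b) in relations."""
    count = 0
    for order in permutations(range(n)):
        position = {v: i for i, v in enumerate(order)}
        if all(position[a] < position[b] for a, b in relations):
            count += 1
    return count


def itlb(n, relations):
    """ITLB = log2(n!) - log2(e(P)) for the poset generated by `relations`."""
    return log2(factorial(n)) - log2(count_linear_extensions(n, relations))


# Three illustrative targets on 4 elements (pairs (a, b) mean a must precede b).
chain = [(0, 1), (1, 2), (2, 3)]   # sorting: a single linear extension
antichain = []                     # no constraint: every ordering is feasible
heap = [(0, 1), (0, 2), (1, 3)]    # shape of a 4-element binary heap

heap_bound = itlb(4, heap)
```

In this sketch the chain gives the familiar sorting bound log2(4!), the antichain gives 0, and the heap shape falls in between.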



Problem history and contribution. The Partial Order Production problem was first
proposed in 1976 by Schönhage [28]. It was studied five years later by Aigner [1]. Another four
years passed and the problem simultaneously appeared in two survey papers: one by Saks [27]
and the other by Bollobás and Hell [2]. In his survey, Saks conjectured that the Partial Order
Production problem can be solved by performing O(ITLB) + O(n) comparisons in the worst
case.

Four years later, in 1989, Yao proved Saks' conjecture [30]. He gave an algorithm solving the
Partial Order Production problem in at most $c_1 \mathrm{ITLB} + c_2 n$ comparisons, for some constants
$c_1$ and $c_2$. However, the preprocessing phase of Yao's algorithm seems difficult to implement
efficiently. In fact, in the last section of his paper [30], Yao asked whether, assuming P is part
of the input (as is the case here), there exists a polynomial-time algorithm for the problem that
performs O(ITLB) + O(n) comparisons.

Our main contribution is an algorithm that solves the Partial Order Production problem
and performs at most ITLB + o(ITLB) + O(n) comparisons in the worst case. The preprocessing
complexity of our algorithm is $O(n^3)$. Hence we answer affirmatively the question of Yao [30]
mentioned above. Moreover, we also significantly improve the ordering complexity, since Yao's
constants $c_1$ and $c_2$ are quite large.

Further references, focussing mainly on lower bounds for the problem and generalizations of it,
include Culberson and Rawlins [12], Chen [9] and Carlsson and Chen [7].

Main ideas underlying our approach. We reduce the Partial Order Production problem
to the multiple selection problem. Instead of solving the problem for the given target poset P we 
solve it for a larger (more constrained) poset that has a simpler structure, namely, a weak order W 
extending P (a weak order is a set of antichains with a total ordering between these antichains). 
This approach works because, as we show below, it is possible to find such a weak order W whose 
corresponding information-theoretic lower bound ITLB is not too large compared to that of P. 

Unfortunately, computing ITLB exactly is #P-hard, because computing the number of linear 
extensions of a poset is #P-complete, a result due to Brightwell and Winkler [3]. The analysis is 
made possible because there exists a quantity, depending on the structure of the target poset, that 
can be computed in polynomial time and provides a good estimate of ITLB. This quantity is nH, 
where H denotes the entropy of the considered target poset. (The entropy of a graph is defined 
in the next section, and the entropy of a poset is defined as the entropy of its comparability 
graph.) It was Körner who introduced the notion of the entropy of a graph, in the context of
source coding [21]. The idea of estimating an information-theoretic lower bound by means of the 
entropy of a poset was used before by Kahn and Kim in their inspiring work on sorting with partial 
information [18], see below. 

Related problems. In 1971 Chambers [8] proposed an algorithm for the Partial Sorting
problem, defined as follows: given a vector V of n numbers and a set $I \subseteq \{1, 2, \ldots, n\}$ of indices,
rearrange the elements of V so that for every $i \in I$, all elements with indices $j < i$ are smaller than or
equal to $V_i$, and elements with indices $j > i$ are bigger than or equal to $V_i$. For the indices $i \in I$, the
elements $V_i$ in the rearranged vector have rank exactly i, hence this problem is also called multiple
selection. The Partial Sorting problem is a special case of Partial Order Production in
which the partial order is also a weak order.

The algorithm proposed by Chambers is similar to Hoare's "find" algorithm [17], or QuickSelect. 
It has been refined and analyzed by Dobkin and Munro [13], Panholzer [25], Prodinger [26], and 
Kaligosi, Mehlhorn, Munro, and Sanders [20]. For our purposes, the key result is that of Kaligosi 
et al. [20] in which it is shown that multiple selection can be done within a lower order term of the 
information theoretic lower bound, plus a linear term. 
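To make the Partial Sorting specification concrete, here is a minimal QuickSelect-style sketch in Python. It is not the algorithm of Kaligosi et al. [20] and makes no attempt to reach their comparison bound; the name `multiple_select`, the random pivot rule and the three-way partition are assumptions made for illustration.

```python
import random


def multiple_select(values, ranks):
    """Return a rearranged copy of `values` such that, for each 1-based rank r in
    `ranks`, position r-1 holds the r-th smallest element, everything before it
    is <= it and everything after it is >= it."""
    a = list(values)

    def solve(lo, hi, wanted):
        # Invariant: a[lo:hi] holds exactly the elements of ranks lo+1 .. hi.
        if hi - lo <= 1 or not wanted:
            return
        pivot = a[random.randrange(lo, hi)]
        block = a[lo:hi]
        smaller = [x for x in block if x < pivot]
        equal = [x for x in block if x == pivot]
        larger = [x for x in block if x > pivot]
        a[lo:hi] = smaller + equal + larger
        left_end = lo + len(smaller)
        right_start = left_end + len(equal)
        # Ranks falling in the `equal` block are already in their final place.
        solve(lo, left_end, [r for r in wanted if r - 1 < left_end])
        solve(right_start, hi, [r for r in wanted if r - 1 >= right_start])

    solve(0, len(a), sorted(ranks))
    return a


values = [5, 3, 8, 1, 9, 2, 7, 3]
result = multiple_select(values, [2, 5])
```

Only the sub-ranges that still contain a requested rank are partitioned further, which is the feature that distinguishes multiple selection from full sorting.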

Another generalization of the sorting problem, called Sorting with Partial Information, 
was studied by Kahn and Kim [18]: 

Given an unknown linear order $\le$ on a set $T = \{t_1, \ldots, t_n\}$, together with a subset $\preceq$ of the
relations $t_i \le t_j$ forming a partial order, determine the complete linear order $\le$ by asking questions
of the form: "is $t_i \le t_j$?".

This problem is equivalent to sorting by comparisons if $\preceq$ is empty. The information-theoretic
lower bound for that problem is $\log e(Q)$, where $Q := (T, \preceq)$. The problem is complementary
to the Partial Order Production problem in the sense that sorting by comparisons can be
achieved by first solving a Partial Order Production problem, then solving the Sorting
with Partial Information problem on the output.

A proof that there exists a decision tree achieving the lower bound up to a constant factor 
has been known for some time (see in particular Kahn and Saks [19]). This is related to the 
1/3-2/3 conjecture of Fredman [14] and Linial [22]. Kahn and Kim [18] provided a polynomial 
time algorithm that finds the actual comparisons. They show that choosing the comparison that 
causes the entropy of Q to increase the most leads to a decision tree that is near-optimal in the 
above sense. 

Overview. In Section 2 we study the entropy of perfect graphs. We show that it is possible
to approximate the entropy of a perfect graph G using a simple greedy coloring algorithm. More
precisely, we prove that any such approximation is at most $H(G) + \log(H(G) + 1) + O(1)$, where
$H(G)$ denotes the entropy of graph G.

Section 3 explains how to apply this result to solve the Partial Order Production problem
algorithmically. We begin the section by remarking that entropy is bound to play a central role for
the problem since $nH(P) - n \log e \le \mathrm{ITLB} \le nH(P)$, where $H(P)$ denotes the entropy of poset
P.

The preprocessing phase of our algorithm starts by applying the greedy coloring algorithm
studied in Section 2 to the comparability graph of P. We then modify this coloring (we "uncross"
the colors) in order to obtain an extension of P which is an interval order I. Another application
of the greedy coloring algorithm, this time on the comparability graph of I, yields a weak order
W extending I. Using our result on perfect graphs, we prove that the entropy of W is not much
larger than that of P, that is, $H(W) \le H(P) + 2\log(H(P) + 1) + O(1)$.

The ordering phase of the algorithm then simply runs a multiple selection algorithm based on
the weak order W. We use a multiple selection algorithm from Kaligosi et al. [20] that performs
a number of comparisons close to the information-theoretic lower bound.

We conclude the section by proving that the preprocessing complexity of our algorithm is $O(n^3)$.

Finally, in Section 4 we discuss the number of comparisons and study the existence of an
algorithm solving the Partial Order Production problem in ITLB + O(n) comparisons. We
give an example showing that such an algorithm cannot always reduce the problem to the case
where the target poset is a weak order. More specifically, we exhibit a family of interval orders with
entropy at most $\frac{1}{2}\log n$, any weak order extension of which has entropy at least
$\frac{1}{2}\log n + \Omega(\log\log n)$.

2 Entropy of Perfect Graphs 

We recall that a subset S of vertices of a graph is a stable set (or independent set) if the vertices
in S are pairwise nonadjacent. Also, a graph G is perfect if $\omega(H) = \chi(H)$ holds for every
induced subgraph H of G, where $\omega(H)$ and $\chi(H)$ denote the clique and chromatic numbers of H,
respectively.

Let us recall similarly that the stable set polytope of an arbitrary graph G with vertex set V
and order n is the n-dimensional polytope

$$\mathrm{STAB}(G) := \operatorname{conv}\{\chi^S \in \mathbb{R}^V : S \text{ stable set in } G\},$$

where $\chi^S$ is the characteristic vector of the subset S, assigning the value 1 to every vertex in S,
and 0 to the others. The entropy of G is defined as (see [11,21])

$$H(G) := \min_{x \in \mathrm{STAB}(G)} -\frac{1}{n} \sum_{v \in V} \log x_v. \qquad (1)$$

For example, if G = (V, E) is the graph with $V := \{a, b, c\}$ and $E := \{bc\}$, then $H(G) = 2/3$ and
the minimum in (1) is attained for $x = (x_a, x_b, x_c) = (1, 1/2, 1/2)$.

Note that graph entropy was originally defined with respect to a given probability distribution
on V. However, for our purposes we can take the uniform distribution, as in [18]. In this case we
obtain Equation (1).
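The three-vertex example can be checked numerically. The sketch below is only an illustration under a stated simplification: for this particular graph the stable set polytope is described by $x_a \le 1$ and $x_b + x_c \le 1$ (together with non-negativity), so a minimizer has $x_a = 1$ and $x_b + x_c = 1$, and a one-dimensional grid scan is enough. The helper name `entropy_objective` and the grid size are arbitrary choices.

```python
from math import log2


def entropy_objective(x):
    """Objective of the entropy definition: -(1/n) * sum of log2(x_v)."""
    return -sum(log2(xv) for xv in x) / len(x)


# Graph with V = {a, b, c} and the single edge bc.
# Scan the points (1, t, 1 - t) for t on a grid strictly inside (0, 1).
steps = 1000
best_value, best_t = min(
    (entropy_objective((1.0, i / steps, 1.0 - i / steps)), i / steps)
    for i in range(1, steps)
)
```

The scan returns the value 2/3 at t = 1/2, in agreement with the example.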

An upper bound on H(G) can be found as follows: First, use the greedy coloring algorithm
that removes iteratively a maximum stable set from G, giving a sequence $S_1, S_2, \ldots, S_k$ of stable
sets of G. If G is perfect, this can be done in polynomial time (see, e.g., Grötschel, Lovász and
Schrijver [16]). Next, let $x \in \mathbb{R}^V$ be defined as

$$x := \sum_{i=1}^{k} \frac{|S_i|}{n}\, \chi^{S_i}.$$

By definition, $x \in \mathrm{STAB}(G)$. We call any such point x a greedy point. The value of the objective
function in the definition of H(G) for x is $\sum_{i=1}^{k} \frac{|S_i|}{n} \log \frac{n}{|S_i|}$. We refer to the latter quantity simply
as the entropy of x. It turns out that this gives a good approximation of H(G) when G is a perfect
graph.
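As an illustration of the greedy point, the following Python sketch peels off maximum stable sets and evaluates the resulting entropy. It swaps in an exponential brute-force search for the maximum stable set, purely to stay self-contained; it is not the polynomial-time procedure referred to in the text, and all function names are hypothetical.

```python
from itertools import combinations
from math import log2


def maximum_stable_set(vertices, edges):
    """Brute force: try subsets from largest to smallest (exponential time)."""
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            if not any(u in chosen and v in chosen for u, v in edges):
                return chosen
    return set()


def greedy_stable_sets(vertices, edges):
    """Repeatedly remove a maximum stable set; returns S_1, ..., S_k."""
    remaining, classes = list(vertices), []
    while remaining:
        live = [(u, v) for u, v in edges if u in remaining and v in remaining]
        stable = maximum_stable_set(remaining, live)
        classes.append(stable)
        remaining = [v for v in remaining if v not in stable]
    return classes


def greedy_entropy(vertices, edges):
    """Entropy of the greedy point: sum of (|S_i|/n) * log2(n/|S_i|)."""
    n = len(vertices)
    return sum(len(s) / n * log2(n / len(s))
               for s in greedy_stable_sets(vertices, edges))


# The example graph with V = {a, b, c} and E = {bc}.
sizes = [len(s) for s in greedy_stable_sets(["a", "b", "c"], [("b", "c")])]
g = greedy_entropy(["a", "b", "c"], [("b", "c")])
```

On the example graph the greedy sets have sizes 2 and 1, so the greedy entropy is log2(3) - 2/3 (about 0.918), which is above H(G) = 2/3, as expected of an upper bound.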

Theorem 2. Let G be a perfect graph on n vertices and denote by $\tilde g$ the entropy of an arbitrary
greedy point $x \in \mathrm{STAB}(G)$. Then

$$\tilde g \le \frac{1}{1-\delta} \left( H(G) + \log\frac{1}{\delta} \right)$$

for all $\delta > 0$, and in particular

$$\tilde g \le H(G) + \log(H(G) + 1) + O(1).$$

A key tool in our proof of Theorem 2 is a min-max relation of Csiszár, Körner, Lovász, Marton,
and Simonyi [11] relating the entropy of a perfect graph G to the entropy of its complement $\overline{G}$:

Theorem 3 ([11]). If G is a perfect graph on n vertices, then $H(G) + H(\overline{G}) = \log n$.
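Theorem 3 can be sanity-checked on the three-vertex example given earlier in this section, for which H(G) = 2/3. The complement of that graph is the path b - a - c, whose stable set polytope is described by $x_a + x_b \le 1$ and $x_a + x_c \le 1$ (with non-negativity), so a minimizer has $x_b = x_c = 1 - x_a$. The Python sketch below scans $x_a$ on a grid; it is a numerical illustration under these assumptions, not a general entropy solver.

```python
from math import log2


def entropy_objective(x):
    """-(1/n) * sum of log2(x_v), the objective minimized by graph entropy."""
    return -sum(log2(xv) for xv in x) / len(x)


# Complement graph: path b - a - c.  Scan points (t, 1 - t, 1 - t).
steps = 3000
h_complement = min(
    entropy_objective((i / steps, 1.0 - i / steps, 1.0 - i / steps))
    for i in range(1, steps)
)

h_graph = 2.0 / 3.0          # value of H(G) stated for the example graph
total = h_graph + h_complement
```

The scan finds its minimum at $x_a = 1/3$, giving log2(3) - 2/3 for the complement, so the two entropies add up to log2(3) as the theorem predicts.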

We now turn to the proof of Theorem 2.

Proof of Theorem 2. Let $S_1, S_2, \ldots, S_k$ be the sequence of stable sets of G selected by the greedy
algorithm (in the order the algorithm removes them). So $S_1$ is a maximum stable set in G, $S_2$
is a maximum stable set in $G - S_1$, and so on. The outline of the proof is as follows: We first
use the sets $S_1, S_2, \ldots, S_k$ to define a point $z \in \mathbb{R}^V$, where V is the vertex set of G. We then
show that z belongs to the stable set polytope of the complement $\overline{G}$ of G, that is, $z \in \mathrm{STAB}(\overline{G})$.
Finally, we derive the desired inequality by combining the upper bound on $H(\overline{G})$ implied by z
with Theorem 3.

Fix $\delta > 0$. For each vertex $v \in V$ we let $m = m(v)$ be the unique index in $\{1, \ldots, k\}$ such that
$v \in S_m$. We define z by letting, for each vertex v of G,

$$z_v := \frac{\delta}{n} \left( \frac{1}{x_v} \right)^{1-\delta} = \frac{\delta}{n} \left( \frac{n}{|S_{m(v)}|} \right)^{1-\delta} = \frac{\delta}{n^{\delta}\, |S_{m(v)}|^{1-\delta}}.$$

We claim that for every stable set S of G:

$$\sum_{v \in S} z_v \le 1. \qquad (2)$$

Write the stable set S as $S = T_1 \cup T_2 \cup \cdots \cup T_\ell$, where $T_i$ is the ith subset of S taken by the
greedy algorithm during its execution. For every $v \in T_1$, no vertex of S has been removed before the
greedy algorithm takes $S_{m(v)}$, and hence $|S_{m(v)}| \ge |S|$, since the greedy algorithm could have
selected the set S when it took $S_{m(v)}$. More generally, for every $i \in \{1, 2, \ldots, \ell\}$ and $v \in T_i$, we
have $|S_{m(v)}| \ge |S| - \sum_{j=1}^{i-1} |T_j|$. It follows in particular that we can enumerate the points of S
as $v_1, v_2, \ldots, v_s$ (with $s = |S|$) in such a way that

$$|S_{m(v_i)}| \ge |S| - i + 1 \qquad \forall i \in \{1, 2, \ldots, s\}.$$

We thus have

$$\sum_{v \in S} z_v = \frac{\delta}{n^{\delta}} \sum_{i=1}^{s} \frac{1}{|S_{m(v_i)}|^{1-\delta}} \le \frac{\delta}{n^{\delta}} \sum_{i=1}^{s} \frac{1}{i^{1-\delta}} \le \frac{\delta}{n^{\delta}} \int_{0}^{s} \frac{dx}{x^{1-\delta}} = \left( \frac{s}{n} \right)^{\delta} \le 1.$$

Equation (2) follows.

Two classical results on perfect graphs are that the stable set polytope is completely described
by the non-negativity and clique inequalities, that is,

$$\mathrm{STAB}(G) = \Big\{ x \in \mathbb{R}^V_+ : \sum_{v \in K} x_v \le 1 \quad \forall K \text{ clique in } G \Big\}$$

(see Chvátal [10]), and that the complement $\overline{G}$ of G is also a perfect graph (Lovász [23]). Combining
these two results with (2) shows that $z \in \mathrm{STAB}(\overline{G})$. Using Theorem 3, we then deduce

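$$
\begin{aligned}
H(G) &= \log n - H(\overline{G}) \\
&\ge \log n + \frac{1}{n} \sum_{v \in V} \log z_v \\
&= \log n + \frac{1}{n} \sum_{v \in V} \log \left[ \frac{\delta}{n} \left( \frac{1}{x_v} \right)^{1-\delta} \right] \\
&= -\frac{1-\delta}{n} \sum_{v \in V} \log x_v - \log\frac{1}{\delta} \\
&= (1 - \delta)\, \tilde g - \log\frac{1}{\delta}.
\end{aligned}
$$

Hence, $\tilde g \le \frac{1}{1-\delta} \left( H(G) + \log\frac{1}{\delta} \right)$, for all $\delta > 0$. By choosing $\delta = 1/2$ if $H(G) < 1$, and
$\delta = 1/(H(G) + 1)$ otherwise, we obtain $\tilde g \le H(G) + \log(H(G) + 1) + O(1)$. □

3 An Algorithm for Partial Order Production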



We denote by G(P) the comparability graph of a poset $P = (V, \le_P)$, and let H(P) := H(G(P)).
Note that a stable set in G(P) is an antichain in P, that is, a set of mutually incomparable elements. 
Note also that G(P) is perfect, a basic result that is dual to Dilworth's theorem, see, e.g., [15]. 
The relevance of the notion of graph entropy in the context of sorting was first observed by Kahn 
and Kim [18]. Using the fact that the volume of STAB(G(P)) equals e(P)/n! (see Stanley [29]), 
they proved the following result. 

Lemma 1 ([18]). For any poset P of order n,

$$-nH(P) \le \log e(P) - \log n! \le n \log n - \log n! - nH(P).$$

When written as

$$2^{-nH(P)} \le \frac{e(P)}{n!} \le 2^{-nH(P)} \cdot \frac{n^n}{n!},$$

the above inequalities become intuitively clear, since $2^{-nH(P)}$ is the (maximum) volume of a box
contained in STAB(G(P)), $e(P)/n!$ is the volume of STAB(G(P)), and $2^{-nH(P)} \cdot n^n/n!$ is the
volume of a simplex containing STAB(G(P)). The lemma directly implies the following equality
for every poset P:

$$\mathrm{ITLB} = \log n! - \log e(P) = nH(P) + O(n). \qquad (3)$$

We recall that a poset is said to be a weak order whenever its comparability graph is a complete
k-partite graph, for some k. Such a poset $W = (V, \le_W)$ can be partitioned into k maximal
antichains $A_1, \ldots, A_k$, the layers of W, such that $v <_W w$ whenever there exist indices i and
j such that $v \in A_i$, $w \in A_j$ and $i < j$. When restricted to weak orders, the Partial Order
Production problem resembles the Partial Sorting problem, with
$I = \{ \sum_{j=1}^{i} |A_j| : i = 1, \ldots, k-1 \}$.
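For a weak order, all the quantities involved are easy to compute from the layer sizes, which gives a convenient sanity check of the sandwich $nH - n \log e \le \mathrm{ITLB} \le nH$ stated in the overview. The Python sketch below assumes two facts: the entropy of a weak order equals the entropy of its layer-size distribution (as used later in the proof of Lemma 5), and its linear extensions are obtained by ordering each layer freely. The function name and the example layer sizes are illustrative.

```python
from math import e, factorial, log2


def weak_order_stats(layer_sizes):
    """For a weak order with layers A_1, ..., A_k of the given sizes, return
    (n * H(W), ITLB, selection ranks)."""
    n = sum(layer_sizes)
    # Entropy of the distribution (|A_i| / n): entropy of a complete k-partite graph.
    entropy = sum(a / n * log2(n / a) for a in layer_sizes)
    # A linear extension orders each layer freely: e(W) = product of |A_i|!.
    log_extensions = sum(log2(factorial(a)) for a in layer_sizes)
    itlb = log2(factorial(n)) - log_extensions
    # Ranks handed to multiple selection: partial sums of the layer sizes.
    ranks, total = [], 0
    for a in layer_sizes[:-1]:
        total += a
        ranks.append(total)
    return n * entropy, itlb, ranks


n_times_h, itlb, ranks = weak_order_stats([3, 1, 4])
```

With layers of sizes 3, 1 and 4, the ranks are 3 and 4, the lower bound is log2(280), and it lies between $nH(W) - n \log e$ and $nH(W)$.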

Our key idea is to show that, using (twice) the greedy coloring algorithm presented in the
previous section, we can efficiently extend (see footnote 2) the given poset P to a weak order W whose
entropy is close to that of P. The reason why we have to use the greedy algorithm twice is that the
coloring obtained might not be "ordered" (might not represent the stable sets of a weak order).
However, we describe below how to uncross this coloring in order to extend P to an interval order
without increasing the entropy too much. We show that applying our greedy coloring to an interval
order provides an "ordered" coloring, which allows us to run our greedy algorithm a second time,
providing an extension which is a weak order.

We then simply run an efficient multiple selection procedure, with W as input. We show 
that, because replacing P by W does not increase the entropy too much, the resulting number of 
comparisons is close to ITLB.

The preprocessing phase is composed of three steps, each of which can be performed in
polynomial time. In the first step, we apply the greedy coloring procedure to G(P), to obtain a greedy
point x. This step makes use of an auxiliary network defined from P. Then, in the second step,
using again the auxiliary network, we extend P to an interval order I whose entropy is not larger
than that of x. This allows us to "uncross" the antichains used in x. (An alternative way of
obtaining the interval order I is to apply Kahn and Kim's [18] laminar decomposition lemma to
x.) Finally, in the third step, we apply the greedy coloring procedure again, this time on G(I), to
obtain the weak order W. See Figure 2 for an illustration of steps 1 and 2.

2 A poset Q extends a poset P if they have the same ground set V and $v \le_P w$ implies $v \le_Q w$, for all $v, w \in V$.

Figure 2: Obtaining an interval order extension of the poset P. Panel (b): network D and
potential y. Panel (c): interval representation.

Auxiliary network. Let $P = (V, \le_P)$ be any poset. We say that v is covered by w in P if
$v \le_P w$, $v \ne w$ and $v \le_P z \le_P w$ implies $z = v$ or $z = w$. The Hasse diagram of P is the network
with node set V, and arc set {(v, w) : v is covered by w in P}. An element v of P is minimal
(resp. maximal) if $z \le_P v$ (resp. $v \le_P z$) implies $z = v$.

We construct a network D = D(P) from the Hasse diagram of P by first uncontracting each
element $v \in V$ to an arc $(v^-, v^+)$ and then adjoining a source node s sending an arc to each
minimal element, and a sink node t receiving an arc from each maximal element. The resulting
network has node set

$$N(D) := \{s, t\} \cup \{v^- : v \in V\} \cup \{v^+ : v \in V\}$$

and arc set

$$
\begin{aligned}
A(D) := {} & \{(s, v^-) : v \in V,\ v \text{ minimal in } P\} \cup \{(v^-, v^+) : v \in V\} \cup {} \\
& \{(v^+, w^-) : v \text{ is covered by } w \text{ in } P\} \cup \{(v^+, t) : v \in V,\ v \text{ maximal in } P\}.
\end{aligned}
$$

This network gives a useful characterization of points in the stable set polytope of the comparability
graph of P, as is explained in the next lemma.
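The construction of the auxiliary network is mechanical, and a small Python sketch may help fix the notation. The representation is an assumption made for illustration: the poset is given by its cover pairs, the split nodes $v^-$ and $v^+$ are encoded as the tuples `(v, "-")` and `(v, "+")`, and the function name `build_network` is hypothetical.

```python
def build_network(elements, covers):
    """Return (nodes, arcs) of the auxiliary network D(P).

    `covers` lists the arcs (v, w) of the Hasse diagram, meaning that v is
    covered by w in P."""
    covers_something = {w for _, w in covers}   # elements with something below them
    is_covered = {v for v, _ in covers}         # elements with something above them
    nodes = ["s", "t"]
    arcs = []
    for v in elements:
        nodes += [(v, "-"), (v, "+")]
        arcs.append(((v, "-"), (v, "+")))       # the uncontracted element
        if v not in covers_something:           # v is minimal in P
            arcs.append(("s", (v, "-")))
        if v not in is_covered:                 # v is maximal in P
            arcs.append(((v, "+"), "t"))
    arcs += [((v, "+"), (w, "-")) for v, w in covers]
    return nodes, arcs


# Example poset on {a, b, c} with a < c and b < c.
nodes, arcs = build_network(["a", "b", "c"], [("a", "c"), ("b", "c")])
```

For the example poset, the network has 8 nodes and 8 arcs: three element arcs, two arcs leaving s, one arc entering t, and two arcs coming from the cover relations.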

Lemma 2. Let P be a poset with ground set V, let G := G(P) and D := D(P). A vector $x \in \mathbb{R}^V$
belongs to STAB(G) if and only if there exists a vector $y \in \mathbb{R}^{N(D)}$ (called a potential) such that
$y_s = 0$, $y_t = 1$, y is nondecreasing along arcs of D, and $y_{v^+} - y_{v^-} = x_v$ for all $v \in V$.

Proof. Again, we use (see Chvátal [10]):

$$\mathrm{STAB}(G) = \Big\{ x \in \mathbb{R}^V_+ : \sum_{v \in K} x_v \le 1 \quad \forall K \text{ clique in } G \Big\}.$$

We first show sufficiency. Let $x \in \mathbb{R}^V$ be a vector that admits a potential $y \in \mathbb{R}^{N(D)}$. Consider
any chain $C = \{v_1, v_2, \ldots, v_c\}$ in P with $v_1 \le_P v_2 \le_P \cdots \le_P v_c$ (cliques in G correspond to chains
in P). Then

$$
\begin{aligned}
\sum_{v \in C} x_v &= (y_{v_1^+} - y_{v_1^-}) + \cdots + (y_{v_c^+} - y_{v_c^-}) \\
&\le (y_{v_1^-} - y_s) + (y_{v_1^+} - y_{v_1^-}) + (y_{v_2^-} - y_{v_1^+}) + \cdots + (y_{v_c^+} - y_{v_c^-}) + (y_t - y_{v_c^+}) \\
&= y_t - y_s = 1.
\end{aligned}
$$

It follows that $x \in \mathrm{STAB}(G)$.

For necessity, consider $x \in \mathrm{STAB}(G)$. For $v \in V$, we let $y_{v^+}$ be the maximum total weight of a
chain of P whose maximum with respect to $\le_P$ is v, when each vertex w is given the weight $x_w$,
and $y_{v^-} := y_{v^+} - x_v$. Then we let $y_s := 0$ and $y_t := 1$. As is easily verified, y is a potential for
x. □

It follows that H(P) is the optimum value of the following convex minimization problem with
a polynomial number of variables and constraints:

$$
\begin{aligned}
\text{(H-potential)} \qquad \min \quad & -\frac{1}{n} \sum_{v \in V} \log x_v \\
\text{s.t.} \quad & x_v = y_{v^+} - y_{v^-} && \forall v \in V \\
& y_p \le y_q && \forall (p, q) \in A(D) \\
& y_s = 0 \\
& y_t = 1.
\end{aligned}
$$

We remark that this formulation shows that H(P) can be computed to within any fixed precision
in strongly polynomial time, using interior point methods (see for instance [24]). However,
approximating H(P) using a greedy point will be enough for our purposes, and will moreover give a
better upper bound on the complexity of our algorithm.



Greedy extensions. Let x be a greedy point in STAB(G), as defined in Section 2. Consider
the potential $y \in \mathbb{R}^{N(D)}$ defined from x as in the proof of Lemma 2. For $v \in V$, we let $y_{v^+}$ be
the maximum (total) weight of a chain of P ending in v, where each vertex w has weight $x_w$, and
$y_{v^-} := y_{v^+} - x_v$. Let also $y_s := 0$ and $y_t := 1$.

From this potential y, we compute an interval order I extending P whose entropy is not larger
than that of x. The ground set of I is V. We let $v \le_I w$ whenever $y_{v^+} \le y_{w^-}$. Thus the open
intervals $(y_{v^-}, y_{v^+})$ (for $v \in V$) provide an interval representation of I. Because $v <_P w$ implies
$y_{v^+} \le y_{w^-}$, which in turn implies $v \le_I w$, the interval order I extends P. The entropy of I is not
larger than that of x because (x, y) remains feasible for the minimization problem (H-potential)
defined above, after P is replaced by I.
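The passage from a greedy point to an interval order can be sketched in a few lines of Python. This is an illustration under stated assumptions: the poset is given as a transitively closed set of strict pairs, the elements are listed in an order compatible with the poset, the greedy point x is supplied by hand, and exact fractions are used so that the comparison between interval endpoints is not affected by rounding. The function names are hypothetical.

```python
from fractions import Fraction


def potential(elements, less_than, x):
    """Return (y_minus, y_plus), where y_plus[v] is the maximum total weight of a
    chain ending in v and y_minus[v] = y_plus[v] - x[v].

    `less_than` holds the pairs (u, v) with u strictly below v (transitively
    closed); `elements` must list every element after all elements below it."""
    y_minus, y_plus = {}, {}
    for v in elements:
        below = [y_plus[u] for u in elements if (u, v) in less_than]
        y_minus[v] = max(below, default=Fraction(0))
        y_plus[v] = y_minus[v] + x[v]
    return y_minus, y_plus


def interval_extension(elements, y_minus, y_plus):
    """Pairs (v, w) of distinct elements with y_plus[v] <= y_minus[w]."""
    return {(v, w) for v in elements for w in elements
            if v != w and y_plus[v] <= y_minus[w]}


# Poset with a < c and b < c; d is incomparable to everything.
elements = ["a", "b", "d", "c"]
less_than = {("a", "c"), ("b", "c")}
# Greedy point: the maximum antichain {a, b, d} is removed first, then {c}.
x = {"a": Fraction(3, 4), "b": Fraction(3, 4), "d": Fraction(3, 4), "c": Fraction(1, 4)}
y_minus, y_plus = potential(elements, less_than, x)
relations = interval_extension(elements, y_minus, y_plus)
```

In this example the intervals of a, b and d are all (0, 3/4) and the interval of c is (3/4, 1), so the interval order contains the original relations and additionally places d below c.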

Apply again the greedy coloring algorithm, but now on G(I). Let $A_1, \ldots, A_k$ denote the
antichains of I produced by the greedy coloring algorithm. Because I is an interval order, we
can find a permutation $\sigma$ of $\{1, \ldots, k\}$ such that $v <_I w$, $v \in A_{\sigma(i)}$ and $w \in A_{\sigma(j)}$ imply $i < j$.
Thus, the weak order W with ground set V obtained by setting $v <_W w$ whenever $v \in A_{\sigma(i)}$ and
$w \in A_{\sigma(j)}$ with $i < j$ is an extension of I. Such a weak order W is said to be a greedy extension of
the original poset P.

Lemma 3. Let P be a poset and W one of its greedy extensions. Then

$$H(W) \le \frac{1}{1-\delta} \left( H(P) + 2 \log\frac{1}{\delta} + 2 \right)$$

for all $\delta > 0$, and in particular

$$H(W) \le H(P) + 2 \log(H(P) + 1) + O(1).$$

Proof. Let $\delta' := \delta/2$. Let I denote the intermediate interval order used to obtain W. Theorem 2
implies

$$
\begin{aligned}
H(P) &\ge (1 - \delta') H(I) - \log(1/\delta') \\
&\ge (1 - \delta') \big( (1 - \delta') H(W) - \log(1/\delta') \big) - \log(1/\delta') \\
&\ge (1 - \delta) H(W) - 2 \log(1/\delta) - 2.
\end{aligned}
$$

In addition to Theorem 2, for the first inequality we used the fact that $H(I) \le \tilde g$, where $\tilde g$ is the
entropy of the greedy point x, and for the second one, we used the fact that the greedy coloring of I
directly gives the unique decomposition of W in maximal stable sets. This shows the first part of the
claim. For the second part, again take $\delta = 1/2$ if $H(P) < 1$, and $\delta = 1/(H(P) + 1)$ otherwise. □

Algorithm and complexity. The above results directly suggest the following algorithm:
compute a greedy extension W of P, and run a multiple selection procedure on T with respect to W.
In terms of the number of comparisons between elements of T, we only incur a controlled penalty.

Theorem 4. The Partial Order Production problem can be solved in polynomial time using
at most

$$\mathrm{ITLB} + o(\mathrm{ITLB}) + O(n) \qquad (4)$$

comparisons between elements of T in the worst case.

Proof. The weak order extension W can be computed in polynomial time. Let us denote by
$A_1, \ldots, A_k$ its layers. We run the multiple selection algorithm on the elements of T, with the ranks
$r_i := \sum_{j=1}^{i} |A_j|$ (for $i = 1, \ldots, k-1$). Kaligosi et al. [20] give a multiple selection algorithm that
requires only $B + o(B) + O(n)$ comparisons in the worst case, where $B := \log n! - \log e(W)$ is the
information-theoretic lower bound for W. Thus

$$
\begin{aligned}
B &= nH(W) + O(n) && \text{(from Eqn. (3))} \\
&\le nH(P) + 2n \log(H(P) + 1) + O(n) && \text{(from Lemma 3)} \\
&= \mathrm{ITLB} + 2n \log\left( \frac{\mathrm{ITLB}}{n} + 1 \right) + O(n) && \text{(from Eqn. (3))} \\
&= \mathrm{ITLB} + o(\mathrm{ITLB}) + O(n).
\end{aligned}
$$

Hence $B + o(B) + O(n) = \mathrm{ITLB} + o(\mathrm{ITLB}) + O(n)$, and the theorem follows. □

We conclude the section by discussing the preprocessing complexity of our algorithm. 

The first execution of the greedy coloring algorithm can be done in time $O(mn)$, where m
is the number of arcs in the network D := D(P) (notice $m \ge n$ and $m = O(n^2)$), as we now
briefly explain. The algorithm finds maximum antichains in the poset by decrementing a flow on
the auxiliary network. This flow has to satisfy lower bounds on the arcs.

Let $X := \emptyset$, $i := 1$, and put a lower bound of $\ell_a := 1$ on each arc a of the form $(v^-, v^+)$ with
$v \in V$, and of $\ell_a := 0$ on every other arc a of D. Start with an arbitrary integer s-t flow $\phi$ of value
n such that $\phi_a \ge \ell_a$ for every arc $a \in A(D)$. Let Y be the set of nodes of D that can be reached
from s following a decrementing path, namely, a path $v_0 v_1 \ldots v_k$ with $v_0 := s$ such that, for every
$i \in \{1, 2, \ldots, k\}$, either $(v_{i-1}, v_i) \in A(D)$ and $\phi_{(v_{i-1}, v_i)} > \ell_{(v_{i-1}, v_i)}$, or $(v_i, v_{i-1}) \in A(D)$. Now, there
are two cases: (1) $t \in Y$. Thus there exists a decrementing s-t path. We then decrement by 1 the
flow value of $\phi$ using the latter path. (2) $t \notin Y$. Observe that no arc of D enters the set Y and
that the arcs a going out of Y satisfy $\phi_a = \ell_a$. It follows that

$$A_i := \{ v \in V \mid (v^-, v^+) \in \delta^+(Y),\ \phi_{(v^-, v^+)} = 1 \}$$

is an antichain of $P - X$. (Here, $\delta^+(Y)$ denotes the set of arcs of D going out of Y.) Moreover,
since the flow value of $\phi$ equals $|A_i|$, the antichain $A_i$ is maximum among the antichains of $P - X$.
This is because, by definition of our lower bounds, the flow value is at least $|A|$, for every antichain
A contained in $P - X$. We then let $\ell_{(v^-, v^+)} := 0$ for every $v \in A_i$, set $X := X \cup A_i$, increment i
by 1, and repeat the above steps, until $X = V$. Computing the set Y, decrementing the flow, and
finding the antichain $A_i$ are steps that can be done in time $O(m)$. Since we go through the main
loop at most 2n times, this implementation of the greedy algorithm runs in time $O(nm)$.

The greedy point x can be computed in time $O(n)$. The corresponding potential y can be found
in $O(m)$ using a simple dynamic program. The second execution of the greedy coloring algorithm
can be done in time $O(n^2)$, using the fact that the comparability graph of the interval order I is a
co-interval graph. Finally, a bound on the complexity of the multiple selection procedure is $O(n^2)$.
So the whole algorithm runs in $O(nm) = O(n^3)$.

4 Tightness 

A natural question is whether there exists an algorithm for Partial Order Production that
does at most ITLB + O(n) comparisons between elements of T. We show in this section that
every algorithm that first extends the target poset to a weak order and then solves the problem
on the weak order can be forced to make $\mathrm{ITLB} + \Omega(n \log\log n)$ comparisons, both in the worst
case and the average case. This is a consequence of the following theorem:

Theorem 5. There exists a constant $c > 0$ such that, for all $n \ge 1$, there is a poset P on n
elements satisfying $H(W) \ge H(P) + c \log\log n$ for every weak order W extending P.

In order to prove Theorem 5, we define a family $\{G_k\}$ ($k \ge 1$) of interval graphs inductively as
follows:

• $G_1$ consists of a unique vertex, and

• for $k \ge 2$, the graph $G_k$ is obtained by first taking the disjoint union of $K_{2^{k-1}}$ (the "central
clique") with two copies of $G_{k-1}$, and then making half of the vertices of the central clique
adjacent to all vertices in the first copy, and the other half to all those in the second copy.

It is easily seen that $G_k$ is indeed an interval graph, as is suggested in Figure 3. The graph $G_k$
has $k 2^{k-1}$ vertices. The complement $\overline{G_k}$ of $G_k$ is the comparability graph of the interval order $I_k$
defined by an interval representation of $G_k$.
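The inductive definition translates directly into code, which makes it easy to check the vertex count for small k. The Python sketch below also enumerates the maximal cliques of $G_k$, which are exactly the maximal stable sets of its complement, using a plain Bron-Kerbosch search. The integer vertex labels and the function names are illustrative assumptions.

```python
from itertools import count


def build_g(k, fresh=None):
    """Return (vertices, edges) of G_k; vertices are integers, edges are frozensets."""
    fresh = fresh if fresh is not None else count()
    if k == 1:
        return [next(fresh)], set()
    clique = [next(fresh) for _ in range(2 ** (k - 1))]      # the central clique
    edges = {frozenset((u, v)) for u in clique for v in clique if u != v}
    vertices = list(clique)
    half = len(clique) // 2
    for part in (clique[:half], clique[half:]):
        sub_vertices, sub_edges = build_g(k - 1, fresh)      # one copy of G_{k-1}
        vertices += sub_vertices
        edges |= sub_edges
        edges |= {frozenset((c, v)) for c in part for v in sub_vertices}
    return vertices, edges


def maximal_cliques(vertices, edges):
    """Plain Bron-Kerbosch enumeration of the maximal cliques."""
    adjacent = {v: {u for u in vertices if frozenset((u, v)) in edges} for v in vertices}
    found = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            found.append(current)
        for v in list(candidates):
            expand(current | {v}, candidates & adjacent[v], excluded & adjacent[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(vertices), set())
    return found


vertices, edges = build_g(3)
cliques = maximal_cliques(vertices, edges)
```

For k = 3 this produces 12 vertices and 7 maximal cliques, each of size 4, matching the counts $k 2^{k-1}$, $2^k - 1$ and $2^{k-1}$.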

Lemma 4. $H(I_k) \le (k + 1)/2$.

Proof. By construction, the maximal stable sets of the graph $\overline{G_k}$ all have $2^{k-1}$ elements, and there
are $2^k - 1$ such maximal stable sets. We define a point $x^{(k)}$ of the stable set polytope $\mathrm{STAB}(\overline{G_k})$
as follows:

$$x^{(k)} := \sum_{i=1}^{2^k - 1} \frac{1}{2^k - 1}\, \chi^{S_i},$$

where $S_1, S_2, \ldots, S_{2^k - 1}$ are the maximal stable sets of $\overline{G_k}$. Observe that, for every $\ell \in \{0, \ldots, k-1\}$,
there are $2^{k-1}$ vertices in $G_k$ that belong to exactly $2^{\ell}$ different maximal stable sets (that is, there
are $2^{k-1}$ intervals of each different length in the interval representation suggested in Figure 3). We
thus obtain the following upper bound on the entropy of $I_k$:

$$H(I_k) \le -\frac{1}{k 2^{k-1}} \sum_{\ell=0}^{k-1} 2^{k-1} \log \frac{2^{\ell}}{2^k - 1} = \log(2^k - 1) - \frac{k-1}{2} \le (k + 1)/2.$$

The lemma follows. □

Figure 3: An interval representation of $G_4$ (colors highlight the different levels of the construction).

We proceed by showing that every weak order extension of $I_k$ has relatively large entropy
compared to $I_k$. We first introduce some definitions. Consider an arbitrary graph G and a coloring
$C_1, \ldots, C_\ell$ of its vertices. Similarly to how greedy points are defined (see Section 2), one can
associate an entropy to the latter coloring, namely, the entropy of the probability distribution
$\{|C_i|/n\}_{i=1,\ldots,\ell}$:

$$-\sum_{i=1}^{\ell} \frac{|C_i|}{n} \log \frac{|C_i|}{n}.$$

The minimum entropy of a coloring is known as the chromatic entropy of G, and is denoted by
$H_\chi(G)$. The chromatic entropy can be thought of as a constrained version of the graph entropy,
in which the stable sets involved in the definition of H(G) are required to form a partition of the
vertices of G.

Lemma 5. Let G be the comparability graph of a poset P. Then any weak order extension W of
P has entropy $H(W) \ge H_\chi(G)$.

Proof. The maximal antichains of W are pairwise disjoint, hence they correspond to a coloring of
G. The entropy of W is equal to the entropy of the latter coloring, and thus is at least $H_\chi(G)$. □



Lemma 5 suggests finding a (good) lower bound on $H_\chi(\overline{G_k})$, the chromatic entropy of $\overline{G_k}$. To
achieve that, we make use of the following result of [4] (see Corollary 1 in that paper).

Theorem 6 ([4]). Let G be an arbitrary graph. Then the entropy of any coloring of G produced
by the greedy coloring algorithm is at most $H_\chi(G) + \log e$.

We can therefore restrict ourselves to analyzing the entropy of greedy colorings of $\overline{G_k}$. Recall
that all maximal stable sets in $\overline{G_k}$ have the same cardinality $2^{k-1}$. Consider the greedy coloring of
$\overline{G_k}$ defined recursively as follows: take first the stable set of cardinality $2^{k-1}$ that corresponds to
the central clique in $G_k$, and then, if $k \ge 2$, recurse on the two copies of $G_{k-1}$ that are left. Let
$g_k$ denote the entropy of the resulting coloring of $\overline{G_k}$.

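Lemma 6. $g_k = (k - 1)/2 + \log k$.

Proof. The greedy coloring defined above consists of $2^{i-1}$ color classes of cardinality $2^{k-i}$, for
$i = 1, 2, \ldots, k$. Hence, its entropy is

$$
\begin{aligned}
g_k &= \sum_{i=1}^{k} 2^{i-1} \cdot \frac{2^{k-i}}{k 2^{k-1}} \log \frac{k 2^{k-1}}{2^{k-i}} \\
&= \frac{1}{k} \sum_{i=1}^{k} \log\left( k 2^{i-1} \right) \\
&= \log k + \frac{1}{k} \sum_{i=1}^{k} (i - 1) \\
&= \log k + \frac{k-1}{2},
\end{aligned}
$$

as claimed. □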
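The closed form of Lemma 6 is easy to confirm numerically. The Python sketch below assumes the class structure used in the computation, namely $2^{i-1}$ color classes of size $2^{k-i}$ for $i = 1, \ldots, k$ (which indeed sum to $n = k 2^{k-1}$ vertices), and compares the entropy of that coloring with $(k-1)/2 + \log k$. The function name is illustrative.

```python
from math import log2


def greedy_coloring_entropy(k):
    """Entropy of a coloring with 2^(i-1) classes of size 2^(k-i), for i = 1..k."""
    n = k * 2 ** (k - 1)
    total = 0.0
    for i in range(1, k + 1):
        size = 2 ** (k - i)
        total += 2 ** (i - 1) * (size / n) * log2(n / size)
    return total


# Largest deviation from the closed form over a range of small k.
max_gap = max(
    abs(greedy_coloring_entropy(k) - ((k - 1) / 2 + log2(k)))
    for k in range(1, 11)
)
```

The two expressions agree up to floating-point error for every k tested.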

We may now turn to the proof of Theorem 5.

Proof of Theorem 5. Let $k \ge 1$ and consider the interval order $I_k$ defined above, of order
$n := k 2^{k-1}$. Let also W be an arbitrary weak order extending $I_k$. Combining Lemmata 4, 5 and 6 with
Theorem 6 gives

$$
\begin{aligned}
H(W) - H(I_k) &\ge H_\chi(\overline{G_k}) - H(I_k) \\
&\ge \left( \frac{k-1}{2} + \log k - \log e \right) - \frac{k+1}{2} \\
&= \log k - \log e - 1 \\
&= \Omega(\log\log n),
\end{aligned}
$$

as claimed. □